- The MAILING DATE of this communication appears on the cover sheet with the correspondence address -
Period for Reply 

A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) FROM 
THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed
after SIX (6) MONTHS from the mailing date of this communication.

- If the period for reply specified above is less than thirty (30) days, a reply within the statutory minimum of thirty (30) days will be considered timely. 

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication. 
- Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any
earned patent term adjustment. See 37 CFR 1.704(b).

Status 

1) [X] Responsive to communication(s) filed on 26 December 2000.
2a) [ ] This action is FINAL. 2b) [X] This action is non-final.

3) [ ] Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.
Disposition of Claims 

4) [X] Claim(s) 1-21 is/are pending in the application.

4a) Of the above claim(s) is/are withdrawn from consideration.

5) [ ] Claim(s) is/are allowed.

6) [X] Claim(s) 1-21 is/are rejected.

7) [ ] Claim(s) is/are objected to.

8) [ ] Claim(s) are subject to restriction and/or election requirement.

Application Papers 

9) [X] The specification is objected to by the Examiner.

10) [X] The drawing(s) filed on 26 December 2000 is/are: a) [ ] accepted or b) [X] objected to by the Examiner.

Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a).

11) [ ] The proposed drawing correction filed on is: a) [ ] approved b) [ ] disapproved by the Examiner.

If approved, corrected drawings are required in reply to this Office action.

12) [ ] The oath or declaration is objected to by the Examiner.
Priority under 35 U.S.C. §§119 and 120 

13) [ ] Acknowledgment is made of a claim for foreign priority under 35 U.S.C. § 119(a)-(d) or (f).

a) [ ] All b) [ ] Some* c) [ ] None of:

1. [ ] Certified copies of the priority documents have been received.

2. [ ] Certified copies of the priority documents have been received in Application No. .

3. [ ] Copies of the certified copies of the priority documents have been received in this National Stage
application from the International Bureau (PCT Rule 17.2(a)).
DETAILED ACTION



(Note that as of October 1, 2002 a new Art Unit 2655 was established that includes this
application, and that this new AU number should be used in all future correspondence.)



Drawings

1. The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5)
because they include the following reference sign(s) not mentioned in the description:

- "195" in figure 1 is not described in the specification.

2. The drawings are objected to because

- Item "306" is labeled twice (in figure 3).

- Item "402" (figure 4) is not labeled.

3. A proposed drawing correction, corrected drawings, or amendment to the
specification to add the reference sign(s) in the description, are required in reply to the
Office action to avoid abandonment of the application. The objection to the drawings will
not be held in abeyance.

Specification



4. The lengthy specification has not been checked to the extent necessary to 
determine the presence of all possible minor errors. Applicant's cooperation is 
requested in correcting any errors of which applicant may become aware in the 
specification, such as: 

- The reference at the end of line 31 (on page 9) should be "195". 

- "A" should be "an" (3rd line of claim 18, page 33).



Application/Control Number: 09/748,453
Art Unit: 2655

Claim Rejections - 35 USC § 102

5. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless -

(a) the invention was known or used by others in this country, or patented or described in a printed
publication in this or a foreign country, before the invention thereof by the applicant for a patent.

Smith et al

6. Claims 1-4 are rejected under 35 U.S.C. 102(a) as being anticipated by Smith et
al (U.S. Patent 6,408,271 B).

7. Regarding claim 1, the features employed by Smith et al in generating phrasal
transcriptions for speech recognition dictionaries by permutating word transcriptions for
each vocabulary item in an orthographic phrase read on the features of the method for
adding an acoustic description of a word to a speech recognition lexicon of the
immediate application as follows:

- Smith et al (column 6 lines 15-20) reads on the feature of converting the text of 
the word into at least one orthographically derived acoustic description of the 
word; 

- Smith et al (column 6 lines 42-46) reads on the feature of generating a score for 
an orthographically derived acoustic description based in part on a comparison 
between the orthographically derived acoustic description and a speech signal 
representing a user's pronunciation of the word; 

- Smith et al (with generating steps 202 & 302 in figures 2 & 3) reads on the
feature of decoding the speech signal (804 in figure 8) representing the user's
pronunciation of the word to produce a decoded acoustic description of the word
and a score for the decoded acoustic description; and

- Smith et al (column 12 lines 26-37) reads on the feature of selecting one of the
orthographically derived acoustic description and the decoded acoustic
description as the acoustic description of the word based on the score for the
orthographically (column 12 lines 30-31) derived acoustic description and the
score for the decoded acoustic description (column 12 lines 35-37).
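
The selecting feature of claim 1 listed above can be illustrated with a minimal sketch. It is not taken from Smith et al or from the application: the names ScoredDescription and select_description, the phone strings, and the score values are hypothetical, and a higher score is assumed to indicate a better match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredDescription:
    """A candidate acoustic description (phone sequence) with its score."""
    phones: tuple
    score: float  # assumption: a higher score means a better match


def select_description(orthographic, decoded):
    """Return whichever candidate has the higher score.

    Ties go to the orthographically derived description (an arbitrary choice).
    """
    return decoded if decoded.score > orthographic.score else orthographic


# Hypothetical candidates for one word.
from_text = ScoredDescription(("t", "ah", "m", "ey", "t", "ow"), -41.5)
from_speech = ScoredDescription(("t", "ah", "m", "aa", "t", "ow"), -37.0)
chosen = select_description(from_text, from_speech)
```

In this sketch the decoded description is chosen because its assumed score is higher; how each score is actually computed is outside what the sketch shows.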

8. Regarding claim 2, the claim is set forth with the same limits as claim 1.
Smith et al (column 13 lines 53-56) reads on the feature of generating an acoustic
model score.
9. Regarding claim 3, the claim is set forth with the same limits as claim 2.
Smith et al (column 13 lines 57-60) reads on the feature of generating an acoustic
model score for at least one decoded acoustic description and using the score as at
least part of the score for the decoded acoustic description.

10. Regarding claim 4, the claim is set forth with the same limits as claim 3.
Smith et al (802 in figure 8) reads on the feature of using the same acoustic model
(specified by "a speech model set", column 13 line 52) to generate both acoustic model
scores (lines 46-56).

Gupta et al 

11. Claims 12-17 are rejected under 35 U.S.C. 102(a) as being anticipated by Gupta
et al (U.S. Patent 6,243,680 B1).

12. Regarding claim 12, the apparatus of Gupta et al for obtaining a transcription of
phrases through text and spoken utterances relates to the features for a
computer-readable medium of the immediate application as follows:

- Gupta et al (column 1 lines 56-57) reads on the feature of receiving text of a word 
for which a phonetic description is to be added to a speech recognition lexicon 
(line 56) and on the feature of receiving a representation of a speech signal 
produced by a person pronouncing the word (line 57); 
- Gupta et al (412 -> 400 in figure 4) reads on the feature of converting the text of
the word into a text-based phonetic description of the word;

- Gupta et al (402 -> 404 in figure 4) reads on the feature of generating a speech- 
based phonetic description of the word from the representation of the speech 
signal; and 

- Gupta et al (406 in figure 4) reads on the feature of selecting a phonetic 
description of the word to add to the speech recognition lexicon by selecting 
between the text-based phonetic description and the speech-based phonetic 
description based in part on the correspondence between each phonetic 
description and the representation of the speech signal. 

13. Regarding claim 13, the claim is set forth with the same limits as claim 12.
Gupta et al (column 7 lines 29-33) reads on the feature of generating a plurality of
possible phonetic descriptions, using at least one model (column 4 lines 19-21) to score
each possible phonetic description (column 5 lines 3-8) and selecting the possible
phonetic description with the highest score as the speech-based phonetic description
(column 5 lines 16-18).

14. Regarding claim 14, the claim is set forth with the same limits as claim 13. 
Gupta et al (column 9 lines 47-62) reads on the feature of using an acoustic model (of 
allophones, column 9 line 59) and a language model (using linguistic rules, column 9 
line 38). 






15. Regarding claim 15, the claim is set forth with the same limits as claim 14.
Gupta et al reads on the feature that using a language model comprises using a language
model that is based on syllable-like units (with the sub-word units of column 9 line 62).

16. Regarding claim 16, the claim is set forth with the same limits as claim 15.
Gupta et al (column 10 lines 6-7) reads on the feature of generating acoustic model
scores for each of the phonemes in a syllable-like unit and (in column 10 lines 15-18)
summing the acoustic model scores of the phonemes to generate an acoustic model
score for the syllable-like unit.
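
The summing feature of claim 16 can be sketched as follows. This is an illustration only, not the method of Gupta et al or of the application: the function name syllable_score, the phoneme labels, and the score values are hypothetical, and the per-phoneme scores are assumed to be log-domain values so that addition is the natural way to combine them.

```python
def syllable_score(phoneme_scores):
    """Sum per-phoneme acoustic model scores into one score for the unit.

    Assumption: scores are log-domain, so summing corresponds to
    multiplying the underlying probabilities.
    """
    return sum(phoneme_scores)


# Hypothetical acoustic model scores for the phonemes of one syllable-like unit.
unit_scores = {"k": -4.0, "ae": -2.5, "t": -3.5}
total = syllable_score(unit_scores.values())
```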

17. Regarding claim 17, the claim is set forth with the same limits as claim 12.

- Gupta et al (column 10 lines 64-66) reads on the feature of generating a score for
the text-based phonetic description based on the correspondence (column 11
lines 29-31) between the text-based phonetic description and the representation
of the speech signal;

- Gupta et al (column 12 lines 10-16) reads on the feature of generating a score for
the speech-based phonetic description based on the correspondence between
the speech-based phonetic description and the representation of the speech
signal; and

- Gupta et al (column 14 lines 24-27) reads on the feature of selecting the phonetic
description with the highest score.
Claim Rejections - 35 USC § 103



18. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 



19. This application currently names joint inventors. In considering patentability of 
the claims under 35 U.S.C. 103(a), the examiner presumes that the subject matter of 
the various claims was commonly owned at the time any inventions covered therein 
were made absent any evidence to the contrary. Applicant is advised of the obligation 
under 37 CFR 1.56 to point out the inventor and invention dates of each claim that was
not commonly owned at the time a later invention was made in order for the examiner to 
consider the applicability of 35 U.S.C. 103(c) and potential 35 U.S.C. 102(e), (f) or (g) 
prior art under 35 U.S.C. 103(a). 

Smith et al & Bahl et al '426

20. Claim 5 is rejected under 35 U.S.C. 103(a) as being unpatentable over Smith et
al in view of Bahl et al '426 (U.S. Patent 5,875,426).
21. Regarding claim 5, the claim is set forth with the same limits as claim 3.
Smith et al does not teach generating a language model score. The Bahl et al '426
method/system for recognizing speech having word liaisons by adding a phoneme to
reference word models (column 3 lines 55-60) reads on the feature of generating a
language model score for the at least one decoded acoustic description and (lines 58-59)
using the language model score as part of the score for the at least one decoded
acoustic description.

It would have been obvious to a person of ordinary skill in the art of speech
signal processing at the time of the invention to apply the method/teachings of Bahl et
al '426 to the device/method of Smith et al so as to consider context among the bases of
making an acoustic decision.

Smith et al, Bahl et al '426 & Bahl et al '921

22. Claims 6-8 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Smith et al in view of Bahl et al '426 and further in view of Bahl et al '921 (U.S. Patent
6,377,921).

23. Regarding claim 6, the claim is set forth with the same limits as claim 5.
Smith et al does not teach generating a language model score. The Bahl et al '921
method/system for identifying mismatches between assumed and actual pronunciations
of words (column 2 lines 30-38) reads on the feature of generating an acoustic model
score and a language model score (lines 39-40) for a sequence of syllable-like units and
(with lines 55-61) the further feature that the decoded acoustic description is derived
from the sequence of syllable-like units.

It would have been obvious to a person of ordinary skill in the art of speech
signal processing at the time of the invention to apply the method/teachings of Bahl et
al '921 to the device/method of Smith et al so as to increase precision and avoid prosodic
differences by addressing the lower cohesive elements of speech.

24. Regarding claim 7, the claim is set forth with the same limits as claim 6.
Smith et al does not teach generating a language model score. Bahl et al '921 (with the
"phones" of column 6 line 13) reads on the feature of dividing the sequence of
syllable-like units into a sequence of phonemes, which would have made it obvious to a person
of ordinary skill in the art of speech signal processing at the time of the invention to
apply the method/teachings of Bahl et al '921 to the device/method of Smith et al so as
not to overlook minor utterances by considering each potential word segment
separately.

25. Regarding claim 8, the claim is set forth with the same limits as claim 6.
Smith et al does not teach generating a language model score. Bahl et al '426 (column 3
lines 51-53) reads on the feature of generating a language model score based on a
trigram language model for syllable-like units, which would have made it obvious to a
person of ordinary skill in the art of speech signal processing at the time of the invention
to apply the method/teachings of Bahl et al '426 to the device/method of Smith et al so as
to more quickly isolate candidates from combinations of segments.

Smith et al, Bahl et al '426 & Contolini et al

26. Claims 9-11 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Smith et al in view of Bahl et al '426 and further in view of Contolini et al (U.S. Patent
6,233,53 B1).

27. Regarding claim 9, the claim is set forth with the same limits as claim 6.
Smith et al does not teach generating a language model score. Contolini et al, in the
method and system for automatically determining phonetic transcriptions associated
with spelled words, reads on the feature of generating acoustic model (of claim 4
limiting by claim 1) scores for each of a sequence of phonemes (column 7 line 6) that
form the sequence of syllable-like units (column 6 line 56).

It would have been obvious to a person of ordinary skill in the art of speech 
signal processing at the time of the invention to apply the method/teachings of Contolini 
et al to the device/method of Smith et al so as to be able to relate the results of the 
recognition that might require correction to those elements that would be familiar to the 
speaker. 



28. Regarding claim 10, the claim is set forth with the same limits as claim 1.
Smith et al does not specify the product reaching a state accessible for human
intervention. Contolini et al does permit such adjustments, with figure 2 reading on
the feature of displaying a user interface comprising an edit box (item 35) in which a
user may enter the text of the word (according to the first lines of the Abstract) and a
list box (item 34) that displays words for which an acoustic description has been
previously added to the speech recognition lexicon.

It would have been obvious to a person of ordinary skill in the art of speech
signal processing at the time of the invention to apply the method/teachings of Contolini
et al to the device/method of Smith et al so as to permit refinements that recognize
exceptions to the rules used to set up the vocabulary.

29. Regarding claim 11, the claim is set forth with the same limits as claim 10.
Smith et al does not specify the product reaching a state accessible for human
intervention.

- Contolini et al (figure 2 & column 4 lines 17-25) reads on the features of receiving
an indication that a user has selected a word in the list box (line 22);

- Contolini et al (column 5 lines 55-56) reads on the features of retrieving the
added acoustic description of the word from the speech recognition lexicon and
converting the retrieved acoustic description into an audible signal.

- It would have been obvious to a person of ordinary skill in the art of speech
signal processing at the time of the invention to apply the method/teachings of
Contolini et al to the device/method of Smith et al so as to audibly confirm the
validity of the revision.
Gupta et al & Contolini et al

30. Claim 18 is rejected under 35 U.S.C. 103(a) as being unpatentable over Gupta et
al in view of Contolini et al.

31. Gupta et al does not specify the product reaching a state accessible for human
intervention, so does not produce audible pronunciations.

- Contolini et al (by selecting the speaker icon at the left of figure 2) reads on the
feature of receiving an instruction to generate an audible pronunciation of a
phonetic description previously added to the speech recognition lexicon;

- Contolini et al (column 4 lines 52-56) reads on the feature of retrieving the added
phonetic description from the speech recognition lexicon, causing an audible
pronunciation to be generated based on the retrieved phonetic description.

- It would have been obvious to a person of ordinary skill in the art of speech 
signal processing at the time of the invention to apply the method/teachings of 
Contolini et al to the device/method of Gupta et al so as to evaluate generated 
speech. 

Schultze & Gupta et al 

32. Claims 19-21 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Schultze (U.S. Patent 6,167,369 A) in view of Gupta et al.
33. Regarding claim 19, the features of the automatic language identification using
both N-gram and word information of Schultze read on the speech recognition system
having a language model generated through a process of the immediate application as
follows:

- Where Schultze does not specifically mention breaking each word into
syllable-like units, Gupta et al reads on the feature of breaking each word in a dictionary
into syllable-like units (with the sub-word units of column 9 line 62). Schultze
(column 1 line 29) then reads on the further feature of, for each word, grouping
the syllable-like units of the word into n-grams;

- Schultze (column 12 lines 21-22) reads on the feature of counting the total
number of n-gram occurrences in the dictionary;

- Schultze (column 12 lines 40-41) reads on the feature of, for each n-gram,
counting the number of occurrences of the n-gram in the dictionary and dividing
this count by the total number of n-gram occurrences to form a language model
probability for the n-gram.

- It would have been obvious to a person of ordinary skill in the art of speech 
signal processing at the time of the invention to apply the method/teachings of 
Gupta et al to the device/method of Schultze so as to separate the contiguous 
signal into discrete portions corresponding to the dictionary for match processing. 
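
The counting and dividing features of claim 19 listed above amount to a relative-frequency estimate, which can be sketched as follows. The sketch is illustrative only and is not drawn from Schultze, Gupta et al, or the application: the function name ngram_probabilities, the toy dictionary, and its syllable-like segmentations are all hypothetical.

```python
from collections import Counter


def ngram_probabilities(dictionary, n=2):
    """Estimate n-gram probabilities over syllable-like units.

    `dictionary` maps each word to its (already segmented) list of
    syllable-like units. Each n-gram count is divided by the total
    number of n-gram occurrences in the whole dictionary.
    """
    counts = Counter()
    for units in dictionary.values():
        # Group the units of this word into overlapping n-grams.
        for i in range(len(units) - n + 1):
            counts[tuple(units[i:i + n])] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


# Hypothetical dictionary with hand-made segmentations.
toy = {
    "banana": ["ba", "na", "na"],
    "nana": ["na", "na"],
    "bandana": ["ban", "da", "na"],
}
probs = ngram_probabilities(toy, n=2)
```

Here the toy dictionary yields five bigram occurrences in total, two of which are ("na", "na"), so that bigram receives probability 2/5.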



34. Regarding claim 20, the claim is set forth with the same limits as claim 19.
Schultze (column 12 lines 35-37) reads on the feature of breaking the words by
preferring syllable-like units that occur more frequently in the dictionary over syllable-like
units that occur less frequently.

35. Regarding claim 21, the claim is set forth with the same limits as claim 20.
Schultze (column 12 line 40) reads on the feature of updating the frequencies of the
syllable-like units into which the word is broken.

Conclusion 

36. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

- Fabiani et al (U.S. Patent Application Publication 2002/0173945 A1) generates 
multilingual transcription groups by mapping models against dictionaries. 

- Hwang et al (U.S. Patent Application Publication 2002/0082831 A1) adds phonetic
descriptions to a speech recognition lexicon. 

- Schoofs et al (U.S. Patent 6,487,532 B1) uses language models to distinguish
homophones. 

- Hab-Umbach et al (U.S. Patent 5,873,061 A) adds words to the speech recognition 
system word model database. 

- Kimura ("100000-Word Recognition Using Acoustic-Segment Networks", 
International Conference on Acoustics, Speech, and Signal Processing, April 1990) 
incorporates both language and voice models. 
- Eichner et al ("Data-Driven Generation Of Pronunciation Dictionaries In The German
Verbmobil Project: Discussion Of Experimental Results", International Conference
on Acoustics, Speech, and Signal Processing, June 2000) uses language models to
correct pronunciations.

- Sabourin (U.S. Patent 6,208,964 B1) unsupervised adaptation of transcriptions relies 
on language model rules. 

- Lee (U.S. Patent 6,067,520 A) continuous Mandarin speech recognition breaks 
speech into sub-syllabic elements for Chinese HMM. 

- Beattie et al (U.S. Patent 5,865,626 A) multi-dialect speech recognition uses 
modeling to determine language. 

- Nishimura et al (U.S. Patent 5,502,791 A) calculates probabilities in syllable-like 
speech segments. 

- Sharman (U.S. Patent 6,363,342 B2) recognizes syllable-like segments of speech. 

- Shaw et al (U.S. Patent 5,949,961) combines speech with text to correct
pronunciation. 

37. Any inquiry concerning this communication or earlier communications from the
Examiner should be directed to Daniel A. Nolan at telephone (703) 305-1368, whose
normal business hours are Mon, Tue, Thu & Fri, from 7 AM to 5 PM.

If attempts to contact the examiner by telephone are unsuccessful, the 
examiner's supervisor, Doris To, can be reached at (703) 305-4827. 
The fax phone number for Technology Center 2600 is (703) 872-9314. Label
informal and draft communications as "DRAFT" or "PROPOSED", & designate formal
communications as "EXPEDITED PROCEDURE".



Formal response to this action may be faxed according to the above instructions, 
or mailed to: 

Commissioner of Patents and Trademarks 
Washington, D.C. 20231 

or hand-delivered to: Crystal Park 2, 

2121 Crystal Drive, Arlington, VA, 
Sixth Floor (Receptionist). 

Any inquiry of a general nature or relating to the status of this application or
proceeding should be directed to Technology Center 2600 Customer Service Office at
telephone number (703) 306-0377.



Daniel A. Nolan 
Examiner 
Art Unit 2655 

DAN/d 

March 1, 2003





DANIEL NOLAN
PATENT EXAMINER



